I.  INTRODUCTION


Consider the discrete-time linear time-invariant stochastic system

$$ X^\circ_{t+1} = A X^\circ_t + W^\circ_{t+1}, \quad X^\circ_0 = \xi, \qquad Y_t = H X^\circ_t + V^\circ_{t+1}, \qquad t = 0,1,\ldots \tag{1.1} $$

where the matrices $A$ and $H$ are of dimension $n \times n$ and $k \times n$, respectively. This system is
defined on some underlying probability triple $(\Omega, \mathcal{F}, P)$ which carries the $\mathbb{R}^n$-valued plant process
$\{X^\circ_t,\ t = 0,1,\ldots\}$ and the $\mathbb{R}^k$-valued observation process $\{Y_t,\ t = 0,1,\ldots\}$. Throughout we make
the following assumptions (A.1)-(A.3), where

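(A.1): The process $\{(W^\circ_{t+1}, V^\circ_{t+1}),\ t = 0,1,\ldots\}$ is a stationary zero-mean Gaussian White Noise
(GWN) sequence with covariance structure $\Gamma$ given by

$$ \Gamma := \mathrm{Cov}\begin{pmatrix} W^\circ_{t+1} \\ V^\circ_{t+1} \end{pmatrix} = \begin{pmatrix} \Sigma^{w} & \Sigma^{wv} \\ \Sigma^{vw} & \Sigma^{v} \end{pmatrix}, \qquad t = 0,1,\ldots \tag{1.2} $$

(A.2): The initial condition $\xi$ has distribution $F$ with finite first and second moments $\mu$ and $\Delta$,
respectively, and is independent of the process $\{(W^\circ_{t+1}, V^\circ_{t+1}),\ t = 0,1,\ldots\}$, and
(A.3): The covariance matrices $\Sigma^{v}$ and $\Delta$ are positive definite.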

For each $t = 0,1,\ldots$, we form the conditional mean $\hat X_{t+1} := E[X^\circ_{t+1} \mid Y_0, Y_1, \ldots, Y_t]$, or
MMSE estimate of $X^\circ_{t+1}$ on the basis of $\{Y_0, Y_1, \ldots, Y_t\}$. In general, $\hat X_{t+1}$ is a non-linear
function of $\{Y_0, Y_1, \ldots, Y_t\}$, in contrast to the LMSE or Kalman estimate of $X^\circ_{t+1}$ on the basis of
$\{Y_0, Y_1, \ldots, Y_t\}$, which is by definition linear, and which we denote by $\hat X^K_{t+1}$. For each $t = 0,1,\ldots$,
we can then calculate $\epsilon_{t+1} := E[\|\hat X_{t+1} - \hat X^K_{t+1}\|^2]$, which is an $L^2$ measure of the agreement between
the MMSE and LMSE estimates of $X^\circ_{t+1}$ on the basis of $\{Y_0, Y_1, \ldots, Y_t\}$.

The goal of this paper is to study the asymptotic behavior of $\epsilon_t$ as the time parameter $t$ tends
to infinity. Noting the dependency

$$ \epsilon_t = \epsilon_t((A,H,\Gamma),F), \qquad t = 1,2,\ldots \tag{1.3} $$

we find it natural to parametrize our analysis of the asymptotics of $\epsilon_t$ in terms of the system triple
$(A,H,\Gamma)$ and of the initial distribution $F$. Of course, if $F$ is a Gaussian distribution, the LMSE
and MMSE estimates coincide and $\epsilon_t = 0$ for all $t = 1,2,\ldots$ and any system triple $(A,H,\Gamma)$.
We are interested in characterizing the limit of the error sequence $\{\epsilon_t,\ t = 0,1,\ldots\}$ and in
obtaining the corresponding rate of convergence. In particular, we seek conditions under which
the convergence $\lim_t \epsilon_t = 0$ takes place, and investigate the form of the corresponding rate of
convergence and its dependence on the initial distribution $F$. Of special interest is the situation
where exponential rates of convergence are available, i.e., $\lim_t \frac{1}{t}\log \epsilon_t = -I$ for some $I > 0$.

To the authors' knowledge, no results on the asymptotics of $\epsilon_t$ for a general non-Gaussian
initial distribution have been reported in the literature. This lack of results may be explained
in part by the fact that the key representation result of Theorem 1 has been derived only relatively
recently (although, see [*]). In any case, the work reported here provides a formal justification for
the idea widely held by practitioners that, short of first and second moment information, precise
distributional assumptions on the initial condition can be dispensed with when estimating the state
$X^\circ_{t+1}$ on the basis of $\{Y_0, Y_1, \ldots, Y_t\}$.

The organization of this paper is as follows. In Section II we summarize a representation result
for $\{\epsilon_t,\ t = 0,1,\ldots\}$ which constitutes the basis for the analysis presented here. In Section III, we
investigate the asymptotic behavior of $\{\epsilon_t,\ t = 0,1,\ldots\}$ for a general multivariable system; this is
followed in Section IV by a more complete analysis of the scalar case when $n = k = 1$.

The following notation is used throughout. Elements of $\mathbb{R}^n$ are viewed as column vectors and
transposition is denoted by $'$. For any positive integers $m$ and $n$, we denote by $\mathcal{M}_{n\times m}$ the space of
$n \times m$ real matrices and by $\mathcal{Q}_n$ the cone of $n \times n$ nonnegative definite matrices. For each positive
integer $n$, let $I_n$ and $O_n$ be the unit and zero elements in $\mathcal{M}_{n\times n}$. Also, for any matrix $K$ in $\mathcal{M}_{n\times n}$,
we define $\mathrm{sp}(K)$ as the set of all eigenvalues of $K$, and set

$$ \lambda_{\min}(K) := \min\{|\lambda| : \lambda \in \mathrm{sp}(K)\} \tag{1.4a} $$

and

$$ \lambda_{\max}(K) := \max\{|\lambda| : \lambda \in \mathrm{sp}(K)\}. \tag{1.4b} $$

We let $\mathcal{E}_n$ be the convex set of square-integrable probability distribution functions on
$(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$ and we define $\mathcal{D}_n$ as the collection of those distributions in $\mathcal{E}_n$ with zero mean.
Finally, for each matrix $R$ in $\mathcal{Q}_n$, $G_R$ denotes the distribution of an $\mathbb{R}^n$-valued Gaussian RV with
zero mean and covariance $R$.

II.  A  REPRESENTATION  RESULT 

The basis for our analysis is a representation result for the sequence $\{\epsilon_t,\ t = 0,1,\ldots\}$ obtained
in [ ]. However, before stating this result, we find it useful to observe that there is no loss of
generality in assuming $E[\xi] = 0$. Indeed, with the notation

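$$ \tilde X^\circ_t := X^\circ_t - \Phi(t,0)\mu \quad\text{and}\quad \tilde Y_t := Y_t - H\Phi(t,0)\mu, \qquad t = 0,1,\ldots \tag{2.1} $$

where $\Phi(t,0) = A^t$, we see that the RV's $\{\tilde X^\circ_t,\ t = 0,1,\ldots\}$ and $\{\tilde Y_t,\ t = 0,1,\ldots\}$ obey the dynamics

$$ \tilde X^\circ_{t+1} = A\tilde X^\circ_t + W^\circ_{t+1}, \quad \tilde X^\circ_0 = \tilde\xi, \qquad \tilde Y_t = H\tilde X^\circ_t + V^\circ_{t+1}, \qquad t = 0,1,\ldots \tag{2.2} $$

where the RV $\tilde\xi := \xi - \mu$ satisfies the zero-mean condition $E[\tilde\xi] = 0$. If $\hat E[A \mid B]$ denotes the LMSE
estimate of $A$ on the basis of $B$ for any square-integrable random vectors $A$ and $B$, we conclude
from basic principles that for each $t = 0,1,\ldots$,

$$ E[\tilde X^\circ_{t+1} \mid \tilde Y_0, \tilde Y_1, \ldots, \tilde Y_t] = E[\tilde X^\circ_{t+1} \mid Y_0, Y_1, \ldots, Y_t] = E[X^\circ_{t+1} \mid Y_0, Y_1, \ldots, Y_t] - A^{t+1}\mu \tag{2.3} $$

and

$$ \hat E[\tilde X^\circ_{t+1} \mid \tilde Y_0, \tilde Y_1, \ldots, \tilde Y_t] = \hat E[\tilde X^\circ_{t+1} \mid Y_0, Y_1, \ldots, Y_t] = \hat X^K_{t+1} - A^{t+1}\mu \tag{2.4} $$

so that

$$ \hat X_{t+1} - \hat X^K_{t+1} = E[\tilde X^\circ_{t+1} \mid \tilde Y_0, \tilde Y_1, \ldots, \tilde Y_t] - \hat E[\tilde X^\circ_{t+1} \mid \tilde Y_0, \tilde Y_1, \ldots, \tilde Y_t]. \tag{2.5} $$

Consequently, for any distribution $F$ in $\mathcal{E}_n$ and any triple $(A,H,\Gamma)$, the relation

$$ \epsilon_t((A,H,\Gamma),F) = \epsilon_t((A,H,\Gamma),\tilde F), \qquad t = 1,2,\ldots \tag{2.6} $$

holds, where $\tilde F$ is the element of $\mathcal{D}_n$ (the distribution of $\tilde\xi = \xi - \mu$) given by

$$ \tilde F(x) := F(x + \mu), \qquad x \in \mathbb{R}^n \tag{2.7} $$

and we may thus restrict our attention to those distributions $F$ in $\mathcal{D}_n$.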

We  now  can  state  the  needed  representation  result,  the  proof  of  which  is  found  in  [4]. 

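Theorem 1. Define the $\mathcal{Q}_n$-valued sequence $\{P_t,\ t = 0,1,\ldots\}$ by the recursions

$$ P_{t+1} = AP_tA' - [AP_tH' + \Sigma^{wv}][HP_tH' + \Sigma^{v}]^{-1}[AP_tH' + \Sigma^{wv}]' + \Sigma^{w}, \quad P_0 = O_n, \qquad t = 0,1,\ldots \tag{2.8} $$

and, for convenience, introduce the $\mathcal{Q}_k$-valued sequence $\{J_t,\ t = 0,1,\ldots\}$, where

$$ J_t := HP_tH' + \Sigma^{v}, \qquad t = 0,1,\ldots \tag{2.9} $$

Let the deterministic sequences $\{Q^*_t,\ t = 0,1,\ldots\}$ and $\{R^*_t,\ t = 0,1,\ldots\}$ in $\mathcal{M}_{n\times n}$ and $\mathcal{Q}_n$,
respectively, be defined recursively by

$$ Q^*_{t+1} = \left[A - [AP_tH' + \Sigma^{wv}]J_t^{-1}H\right]Q^*_t, \quad Q^*_0 = I_n, \qquad t = 0,1,\ldots \tag{2.10} $$

and

$$ R^*_{t+1} = R^*_t + Q^{*\prime}_t H' J_t^{-1} H Q^*_t, \quad R^*_0 = O_n, \qquad t = 0,1,\ldots \tag{2.11} $$

Then the representation

$$ \epsilon_{t+1} = \int_{\mathbb{R}^n} \frac{\left\| Q^*_{t+1} \int_{\mathbb{R}^n} \{z - [R^*_{t+1} + \Delta^{-1}]^{-1} b\} \exp[z'b - \tfrac12 z'R^*_{t+1}z]\, dF(z) \right\|^2}{\int_{\mathbb{R}^n} \exp[z'b - \tfrac12 z'R^*_{t+1}z]\, dF(z)}\, dG_{R^*_{t+1}}(b) \tag{2.12} $$

holds true for each $t = 0,1,\ldots$.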

In order to simplify the expression (2.12), we define the mapping $I_F : \mathcal{M}_{n\times n} \times \mathcal{Q}_n \to \mathbb{R}$,
parameterized by the initial distribution $F$, by setting

$$ I_F(K,R) := \int_{\mathbb{R}^n} \frac{\left\| K \int_{\mathbb{R}^n} \{z - [R + \Delta^{-1}]^{-1} b\} \exp[z'b - \tfrac12 z'Rz]\, dF(z) \right\|^2}{\int_{\mathbb{R}^n} \exp[z'b - \tfrac12 z'Rz]\, dF(z)}\, dG_R(b) \tag{2.13} $$

for all $K$ in $\mathcal{M}_{n\times n}$ and $R$ in $\mathcal{Q}_n$. With this notation, (2.12) may be rewritten as

$$ \epsilon_t = I_F(Q^*_t, R^*_t), \qquad t = 1,2,\ldots \tag{2.14} $$

This representation clearly separates the dependence of $\epsilon_t$ on the system triple $(A,H,\Gamma)$ from the
dependence on the initial distribution $F$; the distribution $F$ affects $\epsilon_t$ only through the structure
of the functional $I_F$, whereas the system triple and time affect $\epsilon_t$ only through $Q^*_t$ and $R^*_t$.

Although (2.14) provides a simple representation for studying the asymptotic behavior of $\epsilon_t$,
we still must study the behavior of $I_F$ under the joint asymptotic behavior of $\{Q^*_t,\ t = 0,1,\ldots\}$
and $\{R^*_t,\ t = 0,1,\ldots\}$. To that end, upon defining the mapping $I^*_F : \mathcal{Q}_n \to \mathbb{R}$ by

$$ I^*_F(R) := I_F(I_n, R) \tag{2.15} $$

for all $R$ in $\mathcal{Q}_n$, we observe the inequalities

$$ \lambda_{\min}(Q^{*\prime}_t Q^*_t)\, I^*_F(R^*_t) \le \epsilon_t((A,H,\Gamma),F) \le \lambda_{\max}(Q^{*\prime}_t Q^*_t)\, I^*_F(R^*_t), \qquad t = 1,2,\ldots \tag{2.16} $$

In effect, (2.16) shows that we may separately consider the asymptotic behavior of $\{Q^*_t,\ t = 0,1,\ldots\}$
and the asymptotic behavior of $I^*_F$ as $\{R^*_t,\ t = 0,1,\ldots\}$ tends to its limit.

III.  A  STABILITY  RESULT 

We now commence our analysis of the asymptotic behavior of $\{\epsilon_t,\ t = 0,1,\ldots\}$ in the general
multivariable case. We focus our attention first on the asymptotics of $\{Q^*_t,\ t = 0,1,\ldots\}$ and
$\{R^*_t,\ t = 0,1,\ldots\}$, and then study the behavior of $I_F$ as $Q^*_t$ and $R^*_t$ asymptotically behave in a
well-defined way. As a first step, we provide a stability criterion for the system $(A,H,\Gamma)$ which
is strong enough to ensure that $\lim_t \epsilon_t = 0$ for any initial distribution $F$ in $\mathcal{E}_n$. If this stability
criterion is satisfied, we may then also make several estimates of the rate at which $\epsilon_t$ tends to
0. Apart from being interesting from an operational viewpoint, these estimates on the rates of
convergence provide an indirect characterization of $F$: they are independent of
the initial distribution $F$ when $F$ is not Gaussian, so that if $\lim_t \epsilon_t = 0$ at a fast enough rate, then
$F$ must necessarily be Gaussian.

We first present some additional notation: We introduce the matrices $\bar A$ and $C$ in $\mathcal{M}_{n\times n}$ and
$\mathcal{Q}_n$ defined by

$$ \bar A := A - \Sigma^{wv}(\Sigma^{v})^{-1}H \tag{3.1} $$

and

$$ C := \Sigma^{w} - \Sigma^{wv}(\Sigma^{v})^{-1}\Sigma^{vw}. \tag{3.2} $$

The matrices $\{K_t,\ t = 0,1,\ldots\}$ in $\mathcal{M}_{n\times n}$ are now defined by

$$ K_t := A - [AP_tH' + \Sigma^{wv}]J_t^{-1}H, \qquad t = 0,1,\ldots \tag{3.3} $$

and we set $K_\infty := \lim_t K_t$ whenever this limit is well defined. With this notation, we may rewrite
the recursion (2.10) as

$$ Q^*_{t+1} = K_tQ^*_t, \qquad t = 0,1,\ldots \tag{3.4} $$

The following stability criterion, taken from [1], is used in what follows.

Theorem 2. If the pair $(A,H)$ is detectable and if the pair $(\bar A,C)$ is controllable, then
1. The matrix $P_\infty := \lim_t P_t$ is well defined and positive definite, and

2. The matrix $K_\infty := A - [AP_\infty H' + \Sigma^{wv}][HP_\infty H' + \Sigma^{v}]^{-1}H$ is stable.

Proof. It is not difficult to verify that

$$ \Sigma^{w} - \Sigma^{wv}(\Sigma^{v})^{-1}\Sigma^{vw} = E\left[[W^\circ_{t+1} - E[W^\circ_{t+1} \mid V^\circ_{t+1}]]\,[W^\circ_{t+1} - E[W^\circ_{t+1} \mid V^\circ_{t+1}]]'\right] \tag{3.5} $$

so that the matrix $\Sigma^{w} - \Sigma^{wv}(\Sigma^{v})^{-1}\Sigma^{vw}$ is symmetric non-negative definite, and its square root is
well defined [2, Secs. VIII.6 and VIII.7]. Claim 1 is Appendix 1 in [1] and claim 2 is Theorem 5.1
in [1].

Because of its importance, we list the assumption of Theorem 2 as the following key condition
(C.1), where

(C.1): The pair $(A,H)$ is detectable and the pair $(\bar A,C)$ is controllable.

Theorem 2 implies the following results concerning $\{Q^*_t,\ t = 0,1,\ldots\}$ and $\{R^*_t,\ t = 0,1,\ldots\}$.

We first observe from (2.11) that $0 \le R^*_t \le R^*_{t+1}$ for all $t = 0,1,\ldots$. Consequently, $R^*_\infty := \lim_t R^*_t$
is always well defined and non-negative definite, although possibly infinite, with

$$ R^*_t = \sum_{s=0}^{t-1} Q^{*\prime}_s H'[HP_sH' + \Sigma^{v}]^{-1}HQ^*_s, \qquad t = 1,2,\ldots \tag{3.7} $$

Theorem 3. Assume the criterion (C.1) to be satisfied.

1. We have $\lim_t Q^*_t = 0$ with

$$ \limsup_t \frac{1}{t}\ln\lambda_{\max}(Q^{*\prime}_tQ^*_t) \le 2\ln\lambda_{\max}(K_\infty) < 0, \tag{3.8} $$

and if the matrix $K_t$ is invertible for each $t = 0,1,\ldots$, then

$$ \liminf_t \frac{1}{t}\ln\lambda_{\min}(Q^{*\prime}_tQ^*_t) \ge 2\ln\lambda_{\min}(K_\infty). \tag{3.9} $$

2. Moreover, $R^*_\infty := \lim_t R^*_t$ is well defined and finite.

Proof. By Theorem 2, $K_\infty = \lim_t K_t$ exists and is stable if (C.1) is satisfied. Claim 1 is now a
consequence of the stability of $K_\infty$ and Appendix B of [3]. To obtain the second claim, we note
from (3.7) that

$$ 0 \le R^*_\infty \le \frac{1}{\lambda_{\min}(\Sigma^{v})}\sum_{t=0}^{\infty} Q^{*\prime}_t H'HQ^*_t \tag{3.10} $$

and the finiteness of $R^*_\infty$ follows from Claim 1.
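As a numerical illustration of Theorems 2 and 3, the following minimal sketch iterates the Riccati recursion (2.8) and the gain (3.3) in the scalar case $n = k = 1$. The function name `scalar_filter_rates` and the example parameter values are assumptions made for illustration only; they do not come from the paper. The sketch also assumes every gain $K_t$ is nonzero so that the logarithm is defined.

```python
import math

def scalar_filter_rates(A, H, sw, sv, swv, steps=200):
    """Iterate the scalar forms of (2.8) and (3.3)-(3.4).

    sw, sv, swv stand for Sigma^w, Sigma^v, Sigma^{wv}.
    Returns (P, K, rate): P approximates P_infinity, K approximates
    K_infinity, and rate = (1/steps) * ln (Q*_steps)^2.
    Assumes every K_t is nonzero.
    """
    P = 0.0        # P_0 = 0
    log_q2 = 0.0   # ln (Q*_0)^2 with Q*_0 = 1
    K = None
    for _ in range(steps):
        J = H * H * P + sv                    # J_t, cf. (2.9)
        K = A - (A * P * H + swv) * H / J     # K_t, cf. (3.3)
        log_q2 += 2.0 * math.log(abs(K))      # Q*_{t+1} = K_t Q*_t, cf. (3.4)
        P = A * A * P - (A * P * H + swv) ** 2 / J + sw   # (2.8)
    return P, K, log_q2 / steps

# Hypothetical example: an unstable plant (A = 1.2) with unit noise variances.
P_inf, K_inf, rate = scalar_filter_rates(1.2, 1.0, 1.0, 1.0, 0.0)
print(P_inf, K_inf, rate, 2.0 * math.log(abs(K_inf)))
```

For these example values the limiting gain has modulus below one even though $|A| > 1$, and the empirical rate approaches $2\ln|K_\infty|$ as the number of steps grows, in line with (3.8) and (3.9), which coincide in the scalar case.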

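For ease of exposition, we define the mapping $\phi : \mathbb{R}^n \times \mathbb{R}^n \times \mathcal{Q}_n \to \mathbb{R}_+ \cup \{\infty\}$ by

$$ \phi(z,b;R) := \exp[z'b - \tfrac12 z'Rz] \tag{3.11} $$

for $z$ and $b$ in $\mathbb{R}^n$ and $R$ in $\mathcal{Q}_n$, and we set

$$ \Phi(b;R) := \int_{\mathbb{R}^n} \phi(z,b;R)\, dF(z). \tag{3.12} $$

A family of probability measures $F_{b,R}$ on $\mathbb{R}^n$, parametrized by $b$ in $\mathbb{R}^n$ and $R$ in $\mathcal{Q}_n$, is now
introduced. Each probability measure in this family is absolutely continuous with respect to $F$
with Radon-Nikodym derivative given by

$$ \frac{dF_{b,R}}{dF}(z) := \begin{cases} \phi(z,b;R)/\Phi(b;R) & \text{if } \Phi(b;R) < \infty \\ 1 & \text{if } \Phi(b;R) = \infty \end{cases} \tag{3.13} $$

for all $z$ in $\mathbb{R}^n$. With this notation, the function $I_F$ can be expressed as

$$ I_F(K,R) = \int_{\mathbb{R}^n} \left\| K \int_{\mathbb{R}^n} \{z - [R + \Delta^{-1}]^{-1}b\}\, dF_{b,R}(z) \right\|^2 \Phi(b;R)\, dG_R(b) \tag{3.14} $$

for all $K$ in $\mathcal{M}_{n\times n}$ and $R$ in $\mathcal{Q}_n$, a form more manageable for our calculations.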

Our first observation is contained in

Proposition 1. For every distribution $F$ in $\mathcal{D}_n$, we have $\limsup_t I^*_F(R^*_t) < \infty$.

Proof. From Jensen's inequality, we conclude that

$$ I^*_F(R^*_t) \le J_F(R^*_t), \qquad t = 1,2,\ldots \tag{3.15} $$

with

$$ J_F(R) := \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \|z - [R + \Delta^{-1}]^{-1}b\|^2\, dF_{b,R}(z)\, \Phi(b;R)\, dG_R(b) \tag{3.16} $$

for all $R$ in $\mathcal{Q}_n$. The definition of $F_{b,R}$ and Tonelli's theorem imply

$$ J_F(R) = \int_{\mathbb{R}^n}\left[\int_{\mathbb{R}^n} \|z - [R + \Delta^{-1}]^{-1}b\|^2 \exp[z'b]\, dG_R(b)\right] \exp[-\tfrac12 z'Rz]\, dF(z), \tag{3.17} $$

where the inner integral may be directly evaluated by using standard results on Gaussian RV's.
After some tedious calculations, we find that

$$ J_F(R) = \mathrm{tr}\{[R + \Delta^{-1}]^{-1}R[R + \Delta^{-1}]^{-1}\} + \int_{\mathbb{R}^n} z'\Delta^{-1}[R + \Delta^{-1}]^{-1}[R + \Delta^{-1}]^{-1}\Delta^{-1}z\, dF(z) \tag{3.18} $$

for every $R$ in $\mathcal{Q}_n$. Since $R^*_t$ is nonnegative definite and $R^*_{t+1} \ge R^*_t$ for $t = 0,1,\ldots$, we have that

$$ \limsup_t J_F(R^*_t) < \infty \tag{3.19} $$

which, together with (3.15), concludes the proof.
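The bound on $J_F$ used in this proof can be made explicit by a short worked computation. The sketch below writes $\Delta$ for the covariance of the initial condition (the second-moment matrix of (A.2)), introduces the shorthand $M := [R + \Delta^{-1}]^{-1}$ (a symbol not used in the paper), and assumes $F$ has zero mean so that $\int z z'\, dF(z) = \Delta$.

```latex
\begin{aligned}
J_F(R) &= \operatorname{tr}\{M R M\} + \int_{\mathbb{R}^n} z' \Delta^{-1} M M \Delta^{-1} z \, dF(z),
        \qquad M := [R + \Delta^{-1}]^{-1}, \\
       &= \operatorname{tr}\{M R M\} + \operatorname{tr}\{\Delta^{-1} M M \Delta^{-1} \Delta\}
        = \operatorname{tr}\{M R M\} + \operatorname{tr}\{M \Delta^{-1} M\} \\
       &= \operatorname{tr}\{M (R + \Delta^{-1}) M\}
        = \operatorname{tr}\{[R + \Delta^{-1}]^{-1}\}
        \le \operatorname{tr}\{\Delta\}.
\end{aligned}
```

Under these assumptions $J_F(R)$ is the trace of the familiar linear-Gaussian posterior covariance, and the last inequality holds for every nonnegative definite $R$, which is consistent with the remark that finiteness of the limit of $R^*_t$ is not needed.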

Note that in Proposition 1, we did not impose the requirement that $R^*_\infty$ be finite.

Collecting  what  we  have  discovered  so  far,  we  obtain  the  following  result. 

Theorem 4. Assume the condition (C.1) to hold. For any square-integrable distribution $F$,

$$ \lim_t \epsilon_t((A,H,\Gamma),F) = 0 \tag{3.20a} $$

and

$$ \limsup_t \frac{1}{t}\log\epsilon_t((A,H,\Gamma),F) \le 2\log\lambda_{\max}(K_\infty) < 0. \tag{3.20b} $$

Proof. It suffices to show (3.20b) since it implies (3.20a). For distributions $F$ in $\mathcal{D}_n$, (3.20b)
follows immediately from (2.16), Theorem 3 and Proposition 1. To extend the results to
distributions $F$ in $\mathcal{E}_n$, we use the transformation (2.6)-(2.7).

Whereas Theorem 4 establishes an upper bound on the rate at which $\epsilon_t$ decays to 0 if (C.1) is
satisfied, we now show lower bounds for this same rate. These lower estimates require the following
condition (C.2) on the nonnegative-definite matrix $R^*_\infty$, namely
(C.2): The matrix $R^*_\infty$ is positive definite.

We then have the following proposition.

Proposition 2. If the distribution $F$ is in $\mathcal{D}_n$ and $0 < R^*_\infty < \infty$, then $F$ is necessarily Gaussian
if $\liminf_t I^*_F(R^*_t) = 0$.

Proof. First we introduce the distribution $\hat F$ in $\mathcal{D}_n$ which is absolutely continuous with respect to
$F$ and whose Radon-Nikodym derivative is given by

$$ \frac{d\hat F}{dF}(z) = \frac{\exp[-\tfrac12 z'R^*_\infty z]}{\int_{\mathbb{R}^n} \exp[-\tfrac12 z'R^*_\infty z]\, dF(z)}, \qquad z \in \mathbb{R}^n. \tag{3.21} $$

The moment generating function $N$ of $\hat F$ is simply

$$ N(b) := \int_{\mathbb{R}^n} \exp[z'b]\, d\hat F(z), \qquad b \in \mathbb{R}^n. \tag{3.22} $$

We show that if $\liminf_t I^*_F(R^*_t) = 0$, then $N$ must satisfy the conditions

$$ \nabla_b N(b) = [R^*_\infty + \Delta^{-1}]^{-1}\, b\, N(b), \qquad N(0) = 1 \tag{3.23} $$

on $\mathbb{R}^n$, from which we conclude that the distribution $\hat F$ is Gaussian; we shall use (3.21) to verify
that $F$ must then also be Gaussian.

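Since the matrix $R^*_\infty$ is positive definite, there exists a finite $T$ such that for $t = T, T+1, \ldots$
the matrix $R^*_t$ is also positive definite and thus $G_{R^*_t}$ is absolutely continuous with respect to the
Lebesgue measure $\nu$ on $\mathbb{R}^n$. Applying Fatou's Lemma to (2.14), we see from the assumption
$\liminf_t I^*_F(R^*_t) = 0$ that

$$ \int_{\mathbb{R}^n} \liminf_t \left[ \frac{\left\| \int_{\mathbb{R}^n} \{z - [R^*_t + \Delta^{-1}]^{-1}b\}\, \phi(z,b;R^*_t)\, dF(z) \right\|^2}{\Phi(b;R^*_t)}\, \frac{dG_{R^*_t}}{d\nu}(b) \right] d\nu(b) = 0. \tag{3.24} $$

If $0 < R^*_\infty < \infty$, we see that for all $b$ in $\mathbb{R}^n$,

$$ \lim_t \frac{dG_{R^*_t}}{d\nu}(b) = \frac{dG_{R^*_\infty}}{d\nu}(b) > 0 \tag{3.25} $$

and

$$ \lim_t \Phi(b;R^*_t) = \Phi(b;R^*_\infty) > 0 \tag{3.26} $$

with the last following by monotone convergence. Combining (3.25)-(3.26), we now conclude that

$$ \liminf_t \left\| \int_{\mathbb{R}^n} \{z - [R^*_t + \Delta^{-1}]^{-1}b\}\, \phi(z,b;R^*_t)\, dF(z) \right\| = 0 \tag{3.27} $$

for $\nu$-almost every $b$, or equivalently

$$ \int_{\mathbb{R}^n} z\, \phi(z,b;R^*_\infty)\, dF(z) = [R^*_\infty + \Delta^{-1}]^{-1}\, b \int_{\mathbb{R}^n} \phi(z,b;R^*_\infty)\, dF(z). \tag{3.28} $$

Upon dividing (3.28) by $\int_{\mathbb{R}^n} \exp[-\tfrac12 z'R^*_\infty z]\, dF(z)$, we obtain (3.23); the technical details are
found in [3].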
The unique solution of (3.23) is

$$ N(b) = \exp\left[\tfrac12\, b'[R^*_\infty + \Delta^{-1}]^{-1}b\right], \qquad b \in \mathbb{R}^n \tag{3.29} $$

so that $\hat F$ is Gaussian with mean 0 and variance $[R^*_\infty + \Delta^{-1}]^{-1}$. Since the variance of $\hat F$ is positive
definite, we see that $\hat F$ is absolutely continuous with respect to $\nu$, and therefore $F$ must be absolutely
continuous with respect to $\nu$ by virtue of the mutual absolute continuity of $F$ and $\hat F$. We calculate
the density of $F$ with respect to $\nu$ by the relation

$$ \frac{dF}{d\nu} = \frac{dF}{d\hat F}\, \frac{d\hat F}{d\nu} \tag{3.30} $$

and find after some arithmetic that

$$ \frac{dF}{d\nu}(z) = c \exp\left[-\tfrac12 z'\Delta^{-1}z\right], \qquad z \in \mathbb{R}^n \tag{3.31} $$

for some constant $c > 0$, i.e., that the distribution $F$ is Gaussian.

The following is an immediate consequence of this proposition.

Theorem 5. If the assumptions (C.1) and (C.2) are satisfied, and $\lambda_{\min}(Q^{*\prime}_tQ^*_t) > 0$ for $t$
sufficiently large, then the distribution $F$ is in fact Gaussian if

$$ \liminf_t \frac{\epsilon_t((A,H,\Gamma),F)}{\lambda_{\min}(Q^{*\prime}_tQ^*_t)} = 0. \tag{3.32} $$

Proof. If the distribution $F$ is in $\mathcal{D}_n$ and (3.32) holds, then from the lower bound of (2.16) we
see that $\liminf_t I^*_F(R^*_t) = 0$, so necessarily $F$ must be Gaussian in view of Proposition 2. The
transformations (2.6) and (2.7) allow us to establish the result for distributions in $\mathcal{E}_n$.

In a similar manner, we may verify a lower bound analogous to the upper bound of Theorem 4.

Theorem 6. If the assumptions (C.1) and (C.2) are satisfied and the matrices $\{K_t,\ t = 0,1,\ldots\}$
are invertible, then

$$ \liminf_t \frac{1}{t}\ln\epsilon_t((A,H,\Gamma),F) \ge 2\ln\lambda_{\min}(K_\infty) \tag{3.33} $$

for all non-Gaussian distributions $F$ in $\mathcal{E}_n$.
IV.  THE  SCALAR  CASE


We now turn to the case where $n = k = 1$. Recall the following standard definition from
control theory [5, Def. 6.5]:

A system $(A,H)$ is said to be stabilizable if all unstable modes are in the controllable subspace,
and make the following definition:

A system $(A,H)$ is said to be marginally stabilizable if all modes which are neither stable
nor critically stable are in the controllable subspace.

Our goal in this section is to verify the following claim.

Theorem 7. Assume $n = k = 1$. We have the following convergence results:

1. If the pair $(\bar A,C)$ is marginally stabilizable, $\lim_t \epsilon_t = 0$ for any distribution $F$ in $\mathcal{E}_1$, whereas
if the pair $(\bar A,C)$ is not marginally stabilizable, then the asymptotic behavior of $\epsilon_t$ depends
nontrivially upon $F$ in $\mathcal{E}_1$.

Moreover we also have the following estimates:

2. If $(\bar A,C)$ is stabilizable, then $\lim_t \epsilon_t = 0$ at an exponential rate independent of $F$ for $F$ in
$\mathcal{E}_1$ non-Gaussian, whereas if the pair $(\bar A,C)$ is marginally stabilizable but not stabilizable, then the
rate depends non-trivially upon $F$.

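We shall prove these results by considering a number of cases. Since we are working in $\mathbb{R}$, we
may rewrite (2.8), (2.10) and (2.11) as

$$ P_{t+1} = A^2P_t - \frac{(AP_tH + \Sigma^{wv})^2}{H^2P_t + \Sigma^{v}} + \Sigma^{w}, \quad P_0 = 0, \qquad t = 0,1,\ldots \tag{4.1} $$

$$ Q^*_{t+1} = \left(\frac{A\Sigma^{v} - \Sigma^{wv}H}{H^2P_t + \Sigma^{v}}\right)Q^*_t, \quad Q^*_0 = 1, \qquad t = 0,1,\ldots \tag{4.2} $$

and

$$ R^*_{t+1} = R^*_t + \frac{H^2(Q^*_t)^2}{H^2P_t + \Sigma^{v}}, \quad R^*_0 = 0, \qquad t = 0,1,\ldots \tag{4.3} $$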
Note also that we have

$$ \epsilon_t = (Q^*_t)^2\, I^*_F(R^*_t), \qquad t = 1,2,\ldots \tag{4.4} $$

for $F$ in $\mathcal{D}_1$.
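The scalar recursions (4.1)-(4.3) are straightforward to iterate numerically. The following is a minimal sketch; the function name `scalar_pqr` and the argument names `sw`, `sv`, `swv` (standing for the noise variances and cross-covariance of (A.1)) are assumptions introduced here for illustration.

```python
def scalar_pqr(A, H, sw, sv, swv, steps):
    """Iterate the scalar recursions (4.1)-(4.3).

    Returns three lists [P_0..P_steps], [Q*_0..Q*_steps], [R*_0..R*_steps].
    sw, sv, swv stand for Sigma^w, Sigma^v, Sigma^{wv}.
    """
    P, Q, R = [0.0], [1.0], [0.0]          # P_0 = 0, Q*_0 = 1, R*_0 = 0
    for _ in range(steps):
        p, q, r = P[-1], Q[-1], R[-1]
        J = H * H * p + sv                 # common denominator H^2 P_t + Sigma^v
        P.append(A * A * p - (A * p * H + swv) ** 2 / J + sw)   # (4.1)
        Q.append((A * sv - swv * H) / J * q)                    # (4.2)
        R.append(r + H * H * q * q / J)                         # (4.3)
    return P, Q, R

# Noise-free, marginally stable plant: A = H = Sigma^v = 1, Sigma^w = Sigma^{wv} = 0.
P, Q, R = scalar_pqr(1.0, 1.0, 0.0, 1.0, 0.0, 5)
print(P, Q, R)
```

With $A = H = \Sigma^v = 1$ and $\Sigma^w = \Sigma^{wv} = 0$, the sketch gives $P_t = 0$, $Q^*_t = 1$ and $R^*_t = t$, the values used later in the proof of Proposition 6. Given any evaluation of $I^*_F$, the error then follows from (4.4) as `Q[t] ** 2` times $I^*_F$ evaluated at `R[t]`.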

We first observe a degeneracy when $H = 0$.

Proposition 3. If $H = 0$, then $\epsilon_t = 0$ for all $t = 1,2,\ldots$ and all distributions $F$ in $\mathcal{E}_1$.

Proof. If $H = 0$, then $R^*_t = 0$ for all $t = 0,1,\ldots$, so $\epsilon_t = 0$ for all $t = 1,2,\ldots$ and all $F$ in $\mathcal{D}_1$ by
directly evaluating (2.12); by translation the result is true for all $F$ in $\mathcal{E}_1$.

We could prove Proposition 3 more directly in the case where $\Sigma^{wv} = 0$, for then the sequences
$\{X^\circ_t,\ t = 0,1,\ldots\}$ and $\{Y_t,\ t = 0,1,\ldots\}$ are independent, so the MMSE and LMSE filters coincide.
We now consider the more interesting case when $H \ne 0$. Note from (3.2) that $(\bar A,C)$ is controllable
if and only if $C \ne 0$, i.e., if and only if $\Sigma^{w}\Sigma^{v} \ne (\Sigma^{wv})^2$. We have

Proposition 4. If $H \ne 0$ and $(\bar A,C)$ is controllable, then $\lim_t \epsilon_t = 0$ for all distributions $F$ in $\mathcal{E}_1$.
If $\bar A = 0$, then $\epsilon_t = 0$ for all $t$ and all $F$ in $\mathcal{E}_1$, whereas if $\bar A \ne 0$, then

$$ \lim_t \frac{1}{t}\ln\epsilon_t = 2\ln\left|\frac{\bar A\,\Sigma^{v}}{H^2P_\infty + \Sigma^{v}}\right| < 0 \tag{4.5} $$

for all non-Gaussian distributions $F$ in $\mathcal{E}_1$.

Proof. If $\bar A = 0$, then $Q^*_t = 0$ for all $t = 1,2,\ldots$ by (4.2). We see from (4.4) that $\epsilon_t = 0$ for all
$t = 1,2,\ldots$ and all $F$ in $\mathcal{D}_1$, whence $\epsilon_t = 0$ for all $t = 1,2,\ldots$ and all $F$ in $\mathcal{E}_1$. If $\bar A \ne 0$, then
$K_t \ne 0$ and $Q^*_t \ne 0$ for all $t = 0,1,\ldots$, so $R^*_\infty > 0$ from (4.3) and we may apply Theorems 4 and 6
to verify (4.5).

We next consider the case when $(\bar A,C)$ is stabilizable but uncontrollable. We can quickly verify
by induction on $t$ that when $C = 0$, $P_t = 0$ and $Q^*_t = \bar A^t$ for all $t = 0,1,\ldots$.

Proposition 5. If $H \ne 0$ and $(\bar A,C)$ is stabilizable but uncontrollable, i.e., $|\bar A| < 1$ and $C = 0$,
then $\epsilon_t$ tends to zero with

$$ \lim_t \frac{1}{t}\ln\epsilon_t = 2\ln|\bar A| < 0. \tag{4.6} $$

Proof. Since $|\bar A| < 1$, we have from (4.3) that $0 < R^*_\infty < \infty$. In view of Propositions 1 and 2, we
have that

$$ 0 < \liminf_t I^*_F(R^*_t) \le \limsup_t I^*_F(R^*_t) < \infty; \tag{4.7} $$

we then arrive at (4.6) by means of (4.4).

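Turning now to the case where $(\bar A,C)$ is not stabilizable, we shall prove the dependencies
given in Theorem 7 by analyzing the asymptotics of $\epsilon_t$ for two specific initial distributions. First,
however, let us verify a general result.

Proposition 6. For any distribution $F$ in $\mathcal{D}_1$, $\limsup_t t\,I^*_F(t) < \infty$ and therefore $\lim_t I^*_F(t) = 0$.

Proof. Since the functional $I^*_F$ is independent of the system dynamics $(A,H,\Gamma)$, we may assume
for the purpose of argumentation that our system is

$$ X^\circ_{t+1} = X^\circ_t, \quad X^\circ_0 = \xi, \qquad Y_t = \xi + V^\circ_{t+1}, \qquad t = 0,1,\ldots \tag{4.8} $$

Here $A = H = \Sigma^{v} = 1$ and $\Sigma^{w} = \Sigma^{wv} = 0$. For this system (which we note to be marginally
stabilizable), $Q^*_t = 1$ and $R^*_t = t$ for all $t = 1,2,\ldots$, so $\epsilon_t = I^*_F(t)$ for $t = 1,2,\ldots$. For all
$t = 0,1,\ldots$, define the linear estimate $\check X_{t+1}$ of $X^\circ_{t+1}$ on the basis of $\{Y_0, \ldots, Y_t\}$ to be

$$ \check X_{t+1} := \frac{1}{t+1}\sum_{s=0}^{t} Y_s, \qquad t = 0,1,\ldots \tag{4.9} $$

Using the facts that $\check X_{t+1}$ is a linear estimator, that $\hat X^K_{t+1}$ is the LMSE estimator, and that $\hat X_{t+1}$ is the
MMSE estimator, we have

$$ \begin{aligned} \|\hat X_{t+1} - \hat X^K_{t+1}\|_\Omega &\le \|\hat X_{t+1} - X^\circ_{t+1}\|_\Omega + \|X^\circ_{t+1} - \hat X^K_{t+1}\|_\Omega \\ &\le \|\check X_{t+1} - X^\circ_{t+1}\|_\Omega + \|\check X_{t+1} - X^\circ_{t+1}\|_\Omega \\ &= 2\|\check X_{t+1} - X^\circ_{t+1}\|_\Omega, \end{aligned} \tag{4.10} $$

where $\|X\|_\Omega := [E(X^2)]^{1/2}$ for any square-integrable RV $X$. From (4.10),

$$ I^*_F(t+1) = \epsilon_{t+1} \le 4E\left[\left(\frac{1}{t+1}\sum_{s=0}^{t} V^\circ_{s+1}\right)^2\right] = \frac{4}{t+1}, \qquad t = 0,1,\ldots \tag{4.11} $$

and the claim is now immediate.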
We shall now consider two distributions $F_1$ and $F_2$ in $\mathcal{D}_1$.

Distribution $F_1$. Distribution $F_1$ admits a density with respect to Lebesgue measure $\lambda$
on $\mathbb{R}$ given by

$$ \frac{dF_1}{d\lambda}(z) = \sum_{i=1}^{n} \frac{\alpha_i}{\sqrt{2\pi}\,\rho} \exp\left[-\frac12\frac{(z - \mu_i)^2}{\rho^2}\right], \qquad z \in \mathbb{R} \tag{4.12} $$

where $\rho > 0$, $0 < \alpha_i < 1$ for $i = 1,2,\ldots,n$, $\sum_{i=1}^{n}\alpha_i = 1$, and $\sum_{i=1}^{n}\alpha_i\mu_i = 0$. We exclude
the case where $F_1$ is actually Gaussian.

Distribution $F_2$. Under $F_2$, the RV $\xi$ takes on a finite number of values $z_1 < z_2 < \ldots < z_n$
with probabilities $p_1, p_2, \ldots, p_n$ respectively, with $\sum_{i=1}^{n} p_iz_i = 0$.

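The following two facts are proved in [3].

Fact 1. We have

$$ I^*_{F_1}(t) = \frac{K}{(\rho^2t + 1)^2}\,(1 + o_1(t)), \qquad t > 0 \tag{4.13} $$

where $\lim_t o_1(t) = 0$ and $K > 0$.

Fact 2. We also have

$$ I^*_{F_2}(t) = \frac{1}{t}\,(1 + o_2(t)), \qquad t > 0 \tag{4.14} $$

where $\lim_t o_2(t) = 0$.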
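The $1/t$ decay that Proposition 7 attributes to a discrete initial distribution can be checked numerically in the simplest instance. The sketch below is an illustration under stated assumptions, not the computation of [3]: it takes the symmetric two-point distribution putting mass 1/2 on each of $\pm a$ (a special case of $F_2$, with covariance $a^2$). For this choice, the weight $\Phi(b;R)\,dG_R(b)$ in (3.14) is an equal mixture of the $N(\pm aR, R)$ laws and the inner conditional mean equals $a\tanh(ab)$, so by symmetry $I^*_F(R) = E[(a\tanh(ab) - b/(R + a^{-2}))^2]$ with $b \sim N(aR, R)$. The function name `istar_two_point` and the quadrature settings are hypothetical.

```python
import math

def istar_two_point(R, a=1.0, n=4001, width=10.0):
    """Approximate I*_F(R), R > 0, for F = (delta_{-a} + delta_{+a}) / 2.

    Uses the trapezoid rule for E[(a*tanh(a*b) - m*b)^2] with b ~ N(a*R, R),
    where m = 1 / (R + 1/a^2) is the scalar [R + Delta^{-1}]^{-1}, Delta = a^2.
    """
    m = 1.0 / (R + 1.0 / (a * a))
    sd = math.sqrt(R)
    lo = a * R - width * sd              # integrate over mean +/- width std devs
    hi = a * R + width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        b = lo + i * h
        dens = math.exp(-0.5 * ((b - a * R) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        f = (a * math.tanh(a * b) - m * b) ** 2
        w = 0.5 if i in (0, n - 1) else 1.0   # trapezoid end-point weights
        total += w * f * dens * h
    return total

# t * I*_F(t) should approach 1 as t grows for this two-point distribution.
for t in (10.0, 100.0, 400.0):
    print(t, t * istar_two_point(t))
```

For large $R$ the hyperbolic tangent saturates and the expectation reduces to $1/(R + a^{-2})$, so the printed products approach 1 from below.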

We now can prove the rest of Theorem 7.

Proposition 7. If $H \ne 0$ and $(\bar A,C)$ is marginally stabilizable but not stabilizable, i.e., $|\bar A| = 1$
and $C = 0$, then $\lim_t \epsilon_t = 0$ for any distribution $F$ in $\mathcal{E}_1$, but the rate of this convergence depends
nontrivially upon $F$ for $F$ non-Gaussian.

Proof. Under the hypothesis, $(Q^*_t)^2 = 1$ and $R^*_t = H^2t/\Sigma^{v}$ by (4.2)-(4.3), so that
$\epsilon_t = I^*_F(H^2t/\Sigma^{v})$ for all $t = 1,2,\ldots$ and all $F$ in
$\mathcal{D}_1$, the extension to $\mathcal{E}_1$ being as before. By Proposition 6, $\lim_t \epsilon_t = 0$; however, if $F = F_1$,
$\lim_t (\ln\epsilon_t)/(\ln t) = -2$, whereas if $F = F_2$, $\lim_t (\ln\epsilon_t)/(\ln t) = -1$.

Finally, we conclude with

Proposition 8. If $H \ne 0$ and $(\bar A,C)$ is not marginally stabilizable, i.e., $|\bar A| > 1$ and $C = 0$, then
$\limsup_t \epsilon_t < \infty$ for all distributions $F$ in $\mathcal{E}_1$, the asymptotic behavior depending nontrivially upon
$F$ for $F$ not Gaussian.

Proof. It is easy to verify that under the hypotheses on $(A,H,\Gamma)$, $\lim_t R^*_t = \infty$ but
$\lim_t (Q^*_t)^2/R^*_t = \Sigma^{v}(\bar A^2 - 1)/H^2$. For $F$ in $\mathcal{D}_1$, then

$$ \epsilon_t = \frac{(Q^*_t)^2}{R^*_t}\, R^*_t\, I^*_F(R^*_t), \qquad t = 1,2,\ldots \tag{4.15} $$

Applying Proposition 6, we get $\limsup_t \epsilon_t < \infty$ for all $F$ in $\mathcal{D}_1$, and thus for all distributions $F$ in
$\mathcal{E}_1$. However, if $F = F_1$, $\lim_t \epsilon_t = 0$ by Fact 1, whereas if $F = F_2$, then (4.15) and Fact 2 give
$\lim_t \epsilon_t = \Sigma^{v}(\bar A^2 - 1)/H^2 > 0$.

The proof of Theorem 7 is complete; all the cases have been considered.